Identifying Evolutionary Topic Temporal Patterns Based on Bursty Phrase Clustering
Abstract
We address a temporal text mining task: finding evolutionary patterns of topics in a collection of article revisions. To reveal how topics evolve, we propose a novel method for finding key phrases that are bursty and significant with respect to revision histories. We then present a time series clustering method that groups phrases with similar burst histories, where additions and deletions are considered separately and each time series is abstracted by burst detection. For clustering, we use dynamic time warping to measure the distance between time sequences of phrase frequencies. Experimental results show that our method clusters phrases into groups that share similar bursts, which can be explained by real-world events.
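The abstract names dynamic time warping (DTW) as the distance between phrase-frequency sequences. The sketch below shows the textbook DTW recurrence only; it is not the authors' implementation. The absolute-difference local cost, the function name `dtw_distance`, and the sample weekly counts are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|;
    the abstract does not say which local cost the authors use.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] aligned with an already-used b[j-1]
                cost[i][j - 1],      # b[j-1] aligned with an already-used a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Hypothetical weekly addition counts for three phrases (not data from the paper).
early_burst = [0, 5, 9, 4, 0, 0, 0]
late_burst = [0, 0, 0, 5, 9, 4, 0]
flat = [1, 1, 1, 1, 1, 1, 1]

print(dtw_distance(early_burst, late_burst))
print(dtw_distance(early_burst, flat))
```

In this toy data the two bursty sequences have the same shape at different times, so unconstrained DTW aligns them far more cheaply than either aligns with the flat sequence. A pairwise matrix of such distances could then be passed to any standard clustering routine; a warping-window constraint would be needed if the timing of a burst should also matter.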
Related resources
Analyzing the Temporal Dynamics of the News Cycle
We explore the dynamics of the evolutionary patterns in the popularity of topics. To measure the popularity quantitatively, we regard the quoted phrase as a carrier of a topic, and track the frequency of its quotation over time. Our motivation starts from finding distinctive patterns that arise in the temporal behavior of the popularity. To find these patterns, we formulate a time series cluste...
Bursty Feature Representation for Clustering Text Streams
Text representation plays a crucial role in classical text mining, where the primary focus was on static text. Nevertheless, well-studied static text representations including TFIDF are not optimized for non-stationary streams of information such as news, discussion board messages, and blogs. We therefore introduce a new temporal representation for text streams based on bursty features. Our bur...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
CLEar: A Real-time Online Observatory for Bursty and Viral Events
We describe our demonstration of CLEar (Clairaudient Ear), a real-time online platform for detecting, monitoring, summarizing, contextualizing and visualizing bursty and viral events, those triggering a sudden surge of public interest and going viral on micro-blogging platforms. This task is challenging for existing methods as they either use complicated topic models to analyze topics in an off-...
Mining Spatial and Temporal Movement Patterns of Passengers on Bus Networks
The analysis of human behavior is the basis of understanding many social phenomena. Accurate and reliable human movement pattern mining can lead to instructive insight to transport management, urban planning and location-based services (LBS). As one of the most widely used forms of transportation, buses can tell a lot of stories about people, including passenger demands, areas people are intere...